Phonology-Augmented Statistical Framework for Machine Transliteration Using Limited Linguistic Resources

Similar resources

Phonology-augmented statistical transliteration for low-resource languages

Transliteration converts words in a source language (e.g., English) into phonetically equivalent words in a target language (e.g., Vietnamese). This conversion needs to take into account the phonology of the target language, that is, the rules determining how phonemes can be organized. For example, a transliterated word in Vietnamese that begins with a consonant cluster is phonologically invalid. Whil...

Statistical Machine Translation: Rapid Development with Limited Resources

We describe an experiment in rapid development of a statistical machine translation (SMT) system from scratch, using limited resources: under this heading we include not only training data, but also computing power, linguistic knowledge, programming effort, and absolute time.

Using Linguistic Knowledge in Statistical Machine Translation

In this thesis, we present methods for using linguistically motivated information to enhance the performance of statistical machine translation (SMT). One of the advantages of the statistical approach to machine translation is that it is largely language-agnostic. Machine learning models are used to automatically learn translation patterns from data. SMT can, however, be improved by using lingui...

Tajik-Farsi Persian Transliteration Using Statistical Machine Translation

Tajik Persian is a dialect of Persian spoken primarily in Tajikistan and written with a modified Cyrillic alphabet. Iranian Persian, or Farsi, as it is natively called, is the lingua franca of Iran and is written with the Persian alphabet, a modified Arabic script. Although the spoken versions of Tajik and Farsi are mutually intelligible to educated speakers of both languages, the difference be...

Transliteration by Bidirectional Statistical Machine Translation

The system presented in this paper uses phrase-based statistical machine translation (SMT) techniques to directly transliterate between all language pairs in this shared task. The technique makes no language-specific assumptions and uses no dictionaries or explicit phonetic information. The translation process transforms sequences of tokens in the source language directly into sequences of toke...

Journal

Journal title: IEEE/ACM Transactions on Audio, Speech, and Language Processing

Year: 2019

ISSN: 2329-9290,2329-9304

DOI: 10.1109/taslp.2018.2875269